BITS Meetings' Virtual Library:
Abstracts from Italian Bioinformatics Meetings from 1999 to 2013


766 abstracts overall from 11 distinct proceedings





1. Bertocco E, Cannata N, Toppo S, Fontana P, Scannapieco P, Valle G
From sequence to function using links to ortholog genes
Meeting: BIOCOMP 2001 - Year: 2001
Topic:

Abstract: Missing

2. Bresciani P, Fontana P, Toppo S, Velasco R
A knowledge-based interface for accessing biological databases
Meeting: BIOCOMP 2002 - Year: 2002
Topic:

Abstract: Missing

3. Cannata N, Dioguardi R, Fontana P, Scannapieco P, Toppo S, Lanfranchi G, Valle G
An integrated knowledge-base of gene expression in human skeletal muscle
Meeting: BIOCOMP 2000 - Year: 2000
Topic: Databanks

Abstract: We have built a solid scaffolding that can hold and connect muscle transcript sequencing data to functional data, expression profiles, genomic sequences and genetic diseases. The starting point is the wide collection of skeletal muscle ESTs produced at CRIBI, which are automatically analysed, filtered and stored in an SQL table (HSPD-EST). A schematic view of the organization of the data is shown in the figure. ESTs are assembled into clusters (HSPD-CLUSTER table), which are very transitory entities, as they may change at every new assembly depending on the order in which the ESTs were merged or on the presence of new variant isoforms determined by alternative splicing or paralogous genes. On the other hand, many transcripts have now been well characterised and should therefore be considered stable entities. We therefore decided to implement a Transcript Integrated Table (TRAIT) of human skeletal muscle, which includes some of the established information that is already available. As can be seen in the figure, we have also implemented a Single-Transcript Integrated Table (STRAIT), where different transcripts are stored in different records, even if they come from the same gene, for instance after alternative splicing. Thus, every single transcript is recorded in STRAIT, while TRAIT is used to link together those transcripts that originated from the same gene. When a new cluster is discovered, a provisional STRAIT record is automatically created. Records become permanent after the addition of further information such as full-length sequencing, functional studies and high-density hybridisation experiments, which are currently performed in our laboratory. All the above information is organised under an SQL database management system, in a protected intranet environment, currently including more than 4,000 STRAIT records.
All the tables are periodically translated into SRS databases and are accessible on the web at http://grup.bio.unipd.it/ . The full implementation of the other databases (shown in the figure in light blue) is currently under way. In particular, a series of scripts and automatic procedures has been developed, linking full and partial transcripts to genomic sequences in view of the release of the entire human genome sequence. Our scripts make use of programs such as BLAST, GeneFinder and Sim4 to perform this analysis systematically on every transcript of our database. The identification of the genomic sequence allows a simple and exact localisation of the genes and gives an indication of the full-length sequence, introns, exons, alternative splicing and promoter region. Similar systematic procedures are also under way to link our muscle transcripts to sequences from model organisms such as yeast, C. elegans, Drosophila and mouse.
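
The relationship between TRAIT and STRAIT described in this abstract can be sketched as a small relational schema. The following is only a minimal illustration using Python's standard sqlite3 module, not the authors' actual schema: every table name, column name (gene_id, transcript_id, provisional and so on) and helper function is hypothetical, and the abstract does not say which SQL system or layout was used.

```python
import sqlite3


def build_schema(conn):
    """Create hypothetical TRAIT (one row per gene) and STRAIT (one row per transcript) tables."""
    conn.executescript(
        """
        CREATE TABLE trait (
            gene_id   INTEGER PRIMARY KEY,
            gene_name TEXT NOT NULL
        );
        CREATE TABLE strait (
            transcript_id INTEGER PRIMARY KEY,
            gene_id       INTEGER REFERENCES trait(gene_id),
            description   TEXT,
            provisional   INTEGER NOT NULL DEFAULT 1  -- 1 = provisional, 0 = permanent
        );
        """
    )


def add_provisional_transcript(conn, gene_id, description):
    """Insert a new STRAIT record; it starts out provisional, as in the abstract."""
    cur = conn.execute(
        "INSERT INTO strait (gene_id, description) VALUES (?, ?)",
        (gene_id, description),
    )
    return cur.lastrowid


def make_permanent(conn, transcript_id):
    """Mark a STRAIT record as permanent once further evidence has been added."""
    conn.execute(
        "UPDATE strait SET provisional = 0 WHERE transcript_id = ?",
        (transcript_id,),
    )


def transcripts_for_gene(conn, gene_id):
    """Use the TRAIT key to collect all transcripts that come from the same gene."""
    rows = conn.execute(
        "SELECT transcript_id FROM strait WHERE gene_id = ? ORDER BY transcript_id",
        (gene_id,),
    )
    return [row[0] for row in rows]
```

In this sketch, two alternatively spliced transcripts of one gene would be two STRAIT rows sharing the same gene_id, and transcripts_for_gene would return both.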

4. Fontana P, De Mattè L, Cestaro A, Segala C, Velasco R, Toppo S
GORetriever: a novel Gene Ontology annotation tool based on semantic similarity for knowledge discovery in database
Meeting: BITS 2006 - Year: 2006
Topic: Molecular sequence analysis

Abstract: Missing

5. Fontana P, Segala C, Toppo S, Moser C, Grando S, Valle G, Velasco R
Bioinformatics within the IASMA grape project: tools for data mining and sequences annotation
Meeting: BIOCOMP 2003 - Year: 2003
Topic: Comparative genomics and molecular evolution

Abstract: Missing

6. Toppo S, Cannata N, Romualdi C, Fontana P, Laveder P, Lanfranchi G, Valle G
Muscle-TRAIT: an integrated platform for storage, annotation and retrieval of data related to muscle transcripts
Meeting: BIOCOMP 2002 - Year: 2002
Topic:

Abstract: Missing

7. Toppo S, Fontana P, Cannata N, Scannapieco P, Bertocco E, Valle G
TRAIT: a database of transcripts expressed in human skeletal muscle
Meeting: BIOCOMP 2001 - Year: 2001
Topic:

Abstract: Missing

8. Toppo S, Fontana P, Velasco R, Valle G, Tosatto SCE
FOX (FOld eXtractor): A novel protein fold recognition method using iterative PSI-BLAST searches and structural alignments
Meeting: BITS 2004 - Year: 2004
Topic: Unspecified

Abstract: We present a novel fold recognition method based on the combination of detailed sequence searches and structural information. Presently the protocol implements two different approaches to assign the correct fold to the target protein sequence: the first is a database secondary structure search and the second is an iterative database sequence search. In the first phase a secondary structure prediction of the target is performed using the ConSSPred protocol. This prediction is used to search for hits against a database of known secondary structures extracted from PDB (using DSSP). The search follows a two-step strategy: the first step is a Smith-Waterman local secondary structure similarity search with a specific substitution matrix optimized for secondary structure alignment. The second step is a global alignment based on SSEA (Secondary Structure Element Alignment), as implemented in our program MANIFOLD, which refines the score and the alignment itself in the region extracted by the first step. At the end of the first phase a list of hits that share a similar secondary structure topology with the target sequence is extracted. The second phase is based on a modified version of SENSER, a protocol for scanning the sequence database. At the beginning of the second phase, BLASTP is used to scan the target sequence against the NR database. These initial hits are clustered to reduce sequence bias and a seed alignment with 20 or fewer sequences is generated. This step ensures that PSI-BLAST can be jump-started with a more sensitive initial profile, increasing its sequence diversity. PSI-BLAST is run for four iterations (e-value inclusion threshold 10e-3) on the NR60 database of known sequences. NR60 is produced by applying the CD-HIT algorithm to cluster the NR database at 60% sequence identity.
Sequences producing NR60 hits with the query are assigned either to the significant sequence space (e-value <= 10e-3) or the trailing end (e-value <= 10) for further use. The profile is used to search the PDBAA database of sequences with known structure. If a significant PDBAA hit (e-value <= 10) is found, the protocol proceeds to the back-validation step (see below). If no significant hit is found, or the hit does not back-validate, a new PSI-BLAST search, using the above "4+1" protocol on NR and PDBAA, is started for the highest ranking sequence (i.e. lowest e-value) in the significant sequence space. Sequences from NR60 matching the query are also assigned to either the significant sequence space or the trailing end. Significant PDBAA hits are again submitted to back-validation. If no significant PDBAA hit is recorded and the significant sequence space has been exhausted, then the protocol uses the trailing end sequences as additional starting points for PSI-BLAST searches. In contrast to previous sequences, which were assumed to be similar enough to the target to imply homology, these sequences are submitted to back-validation before proceeding to the "4+1" PSI-BLAST protocol. The back-validation step consists of using PSI-BLAST to find the target starting from a different query sequence, found as described above. That is, due to the asymmetric nature of PSI-BLAST, if sequence A finds sequence B it is not always the case that B also finds A. Sequences that back-validate are more likely to be correct hits. Once a sequence from PDBAA back-validates and its secondary structure is compatible with that of the target sequence as found in the first phase, the protocol builds a target-to-template alignment and stops. The procedure described so far serves to identify a template structure for the target sequence. In order to produce an accurate alignment, HMMER is used to build a hidden Markov model (HMM) based on the HOMSTRAD sequence alignment.
The target is then aligned to the template using this HMM. Preliminary results for the method indicate a clear increase in both detection rate and alignment accuracy for distantly homologous sequences. Presently FOX has been tested on the Fischer-68 test set to compare its performance with standard PSI-BLAST searches, GenTHREADER and the original SENSER protocol. As expected, the introduction of the secondary structure prediction of the protein target and of the database secondary structure searches in the first phase has increased the detection sensitivity and sensibility of the method compared to profile-based searches such as PSI-BLAST and the SENSER protocol (Fig. 1). The performance is comparable to GenTHREADER, with the right template structure always found in the top 50 hits, as shown in Fig. 1. Further score optimization and development are required to definitively test the entire protocol and make the program available as a web-based server from our group's web site (http://protein.cribi.unipd.it/).
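
Two of the steps described in this abstract, the split of hits into a significant sequence space and a trailing end, and the back-validation check, can be sketched in a few lines. This is only an illustrative sketch under stated assumptions, not the FOX or SENSER implementation: the `search` callable is a hypothetical stand-in for a PSI-BLAST run that returns a mapping from hit identifier to e-value, the threshold "10e-3" is taken literally as 0.01, and the e-value cut-off used for back-validation is an assumption, since the abstract does not state one.

```python
from typing import Callable, Dict, List, Tuple

# Thresholds as written in the abstract; "10e-3" is read literally (0.01).
SIGNIFICANT_EVALUE = 10e-3
TRAILING_EVALUE = 10.0

# Hypothetical search interface: query id -> {hit id: e-value}.
SearchFn = Callable[[str], Dict[str, float]]


def partition_hits(hits: Dict[str, float]) -> Tuple[List[str], List[str]]:
    """Split hits into (significant sequence space, trailing end), lowest e-value first."""
    ranked = sorted(hits.items(), key=lambda item: item[1])
    significant = [name for name, e in ranked if e <= SIGNIFICANT_EVALUE]
    trailing = [
        name for name, e in ranked
        if SIGNIFICANT_EVALUE < e <= TRAILING_EVALUE
    ]
    return significant, trailing


def back_validates(
    target: str,
    candidate: str,
    search: SearchFn,
    threshold: float = TRAILING_EVALUE,  # assumed cut-off, not given in the abstract
) -> bool:
    """Return True if a search started from `candidate` finds `target` again."""
    reverse_hits = search(candidate)
    return reverse_hits.get(target, float("inf")) <= threshold
```

Because the search is asymmetric, a candidate that appears among the target's hits is accepted here only if `back_validates` also succeeds in the reverse direction; passing the search as a callable keeps the sketch independent of how PSI-BLAST is actually invoked.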



BITS Meetings' Virtual Library
driven by Librarian 1.3 in PHP, MySQL™ and Apache environment.

For information, email to paolo.dm.romano@gmail.com .